Uk national #157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Conversation
…nal-and-gsp-locations # Conflicts: # src/quartz_api/cmd/main.py # src/quartz_api/internal/backends/dataplatform/client.py # src/quartz_api/internal/models.py
…nal-and-gsp-framework # Conflicts: # src/quartz_api/cmd/main.py
…-locations # Conflicts: # src/quartz_api/cmd/main.py # src/quartz_api/internal/service/uk_national/gsp.py # src/quartz_api/internal/service/uk_national/national.py # src/quartz_api/internal/service/uk_national/system.py
# Conflicts: # src/quartz_api/internal/service/uk_national/national.py
# Conflicts: # src/quartz_api/internal/service/uk_national/national.py
# Conflicts: # src/quartz_api/internal/service/uk_national/gsp.py # src/quartz_api/internal/service/uk_national/national.py # src/quartz_api/internal/service/uk_national/system.py
# Conflicts: # src/quartz_api/internal/backends/dataplatform/client.py
# Conflicts: # src/quartz_api/internal/service/uk_national/description.py
# Conflicts: # pyproject.toml # src/quartz_api/cmd/main.py # src/quartz_api/internal/backends/dataplatform/client.py # src/quartz_api/internal/backends/dummydb/client.py # src/quartz_api/internal/backends/quartzdb/client.py # src/quartz_api/internal/models/__init__.py # src/quartz_api/internal/models/db_interface.py # uv.lock
```python
        repr(sorted(params)),
    ])

    log.info(f"Cache key generated: {key}")
```
probably remove
# Conflicts: # src/quartz_api/internal/backends/dataplatform/client.py # src/quartz_api/internal/models/endpoint_types.py
devsjc left a comment
Not a finished review, haven't got all the way down the files yet!
```python
def get_window(
    start: datetime | None = None, end: datetime | None = None,
) -> tuple[datetime, datetime]:
    """Returns the start and end of the window for timeseries data."""
    # Window start is the beginning of the day two days ago
    start = (dt.datetime.now(tz=dt.UTC) - dt.timedelta(days=2)).replace(
        hour=0,
        minute=0,
        second=0,
        microsecond=0,
    )
    if start is None:
        start = (datetime.now(tz=UTC) - timedelta(days=2))
        start = floor_6_hours_dt(start)

    # Window end is the beginning of the day two days ahead
    end = (dt.datetime.now(tz=dt.UTC) + dt.timedelta(days=2)).replace(
        hour=0,
        minute=0,
        second=0,
        microsecond=0,
    )
    if end is None:
        end = start + timedelta(days=4)
```
I'd be tempted to get rid of this altogether and just mandate that all the timeseries routes have a start and an end
Yea, the logic has to sit somewhere: people call the current uk-national API with no start or end datetimes, so we have to make them.
Is it possible with FastAPI to just have them in the routes but make them optional such that they default to current time +/- 2 days or whatever it's meant to be?
yep, there is probably a way to do that. So move it from the backend to the routes (and specifically the uk-national and ruvnl region routes)
I'd say so - would save having to have this file at least!
```python
    model_name: str | None = None,
    start_datetime: datetime | None = None,
    end_datetime: datetime | None = None,
    created_utc_upper_limit: datetime | None = None,
```
Is created_before_datetime more in line with the other fields?
Yea, I'm happy to change it to that.
```python
class Region(BaseModel):
    """Region metadata."""

    region_name: str = Field(..., json_schema_extra={"description": "The name of the region."})
    region_metadata: dict | None = Field(
        None,
        json_schema_extra={"description": "Additional metadata about the region."},
    )
```
Can this not be a subclass of LocationPropertiesBase?
Could do. Does LocationPropertiesBase force a lat/lon, whereas a Region doesn't?
Everything has a lat/lon, even regions, as they have a centroid/associated point. So it's not a problem for region to have a lat/lon.
I'm actually tempted to leave this (it's just been moved from below) and tidy it up another time.
```python
    func: Callable,  # noqa
    namespace: str = "",
    *,
    request: Request = None,
    response: Response = None,  # noqa
    args,  # noqa
    kwargs,  # noqa
```
Do you need all these #noqa's? You have some type hinting; also, you never use args or kwargs.
I thought you need a #noqa per line, but I could be wrong.
I'll look into removing the ones I don't need, but it might be that the cache function needs certain inputs.
You do need it per line, but what I meant was, are they doing anything? If you remove the unused arguments does something complain?
I'll check; it might be that the cache function has to have them.
```python
log = logging.getLogger(__name__)


async def key_builder(
```
TODO: add permission to the cache key, for intraday users.
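A sketch of how the permission could be folded into the key (field names are hypothetical, not the actual key_builder signature): callers with different intraday permissions hash to different cache entries, so a privileged user's response is never served from cache to an unprivileged one.

```python
import hashlib


def build_cache_key(
    namespace: str,
    path: str,
    params: dict,
    permissions: frozenset[str],
) -> str:
    """Build a cache key that varies with the caller's permissions.

    Including permissions (e.g. intraday access) in the hashed material keeps
    users with different access levels from sharing cache entries.
    """
    raw = ":".join([
        namespace,
        path,
        repr(sorted(params.items())),
        repr(sorted(permissions)),
    ])
    return hashlib.sha256(raw.encode()).hexdigest()
```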
```python
) -> list[models.PredictedPower]:
    values = await self._get_predicted_power_production_for_location(
        location_uuid=UUID(location),
        location_uuid=location,
```
Wouldn't it be safer to make the location input to the function a UUID type? Then we can enforce it at the route validation level
```python
    forecast_horizon_minutes=forecast_horizon_minutes,
    smooth_flag=smooth_flag,
    oauth_id=None,
    model_name=model_name,
```
We use forecaster now (or try to!)
OK, I will change that.
```python
async def get_forecast_metadata(
    self,
    location_uuid: str,
    authdata: dict[str, str],  # noqa: ARG002
    model_name: str | None = None,
) -> models.ForecastMetadata:
    """Get forecast metadata for a site."""
    req = dp.GetLatestForecastsRequest(
        location_uuid=location_uuid,
        energy_source=dp.EnergySource.SOLAR,
    )
    resp = await self.dp_client.get_latest_forecasts(req)

    # Filter by model name if provided
    if model_name:
        resp.forecasts = [
            forecast for forecast in resp.forecasts
            if forecast.forecaster.forecaster_name == model_name
        ]

    resp.forecasts.sort(
        key=lambda f: f.created_timestamp_utc,
        reverse=True,
    )
    forecasts = resp.forecasts[0]

    return models.ForecastMetadata(
        initialization_timestamp_utc=forecasts.initialization_timestamp_utc,
        created_timestamp_utc=forecasts.created_timestamp_utc,
        forecaster_name=forecasts.forecaster.forecaster_name,
        forecaster_version=forecasts.forecaster.forecaster_version,
    )
```
This entire function, just to return an init_timestamp of the latest forecast, seems unnecessary and ripe for overuse; surely everything that wants this can just get it as part of the response for its own query?
It's also used in the national route where we return metadata, so I think we need all this.
In my other comments I tried to describe that the metadata that is fetched by this route is already surfaced by the Data Platform's GetForecastAsTimeseries route. So all that has to be done is for that data to be surfaced in the _get_solar_power_production_for_timestamp DBClient function and it can be passed into the output of the national route without making any extra calls to the database.
```python
@override
async def get_forecast_for_multiple_locations(
    self,
    location_uuids_to_location_ids: dict[str, int],
    authdata: dict[str, str],
    start_datetime_utc: dt.datetime | None = None,
    end_datetime_utc: dt.datetime | None = None,
    model_name: str | None = None,
) -> list[models.OneDatetimeManyForecastValuesMW]:
    """Get a forecast for multiple sites.

    Args:
        location_uuids_to_location_ids: A mapping from location UUIDs to location IDs.
        authdata: Authentication data for the user.
        start_datetime_utc: The start datetime for the prediction window. Default is None.
        end_datetime_utc: The end datetime for the prediction window. Default is None.
        model_name: The name of the forecasting model to use. Default is None.

    Returns:
        A list of OneDatetimeManyForecastValuesMW objects.
    """
    start, end = get_window(start=start_datetime_utc, end=end_datetime_utc)

    # timestamps 30 mins apart from start to end
    n_half_hours = int((((end - start).total_seconds() // 60) // 30) + 1)
    timestamps = [start + dt.timedelta(minutes=30 * x) for x in range(n_half_hours)]

    # get forecasters
    req = dp.ListForecastersRequest(
        forecaster_names_filter=[model_name],
        latest_versions_only=True,
    )
    resp = await self.dp_client.list_forecasters(req)
    forecaster = resp.forecasters[0]

    forecasts_per_timestamp = []
    tasks = []
    for timestamp in timestamps:
        req = dp.GetForecastAtTimestampRequest(
            location_uuids=list(location_uuids_to_location_ids.keys()),
            energy_source=dp.EnergySource.SOLAR,
            timestamp_utc=timestamp,
            forecaster=forecaster,
        )
        task = asyncio.create_task(self.dp_client.get_forecast_at_timestamp(req))
        tasks.append(task)
    list_results = await asyncio.gather(*tasks, return_exceptions=True)
    for exc in filter(lambda x: isinstance(x, Exception), list_results):
        raise exc

    for resp in list_results:
        if len(resp.values) == 0:
            continue

        forecasts_one_timestamp = models.OneDatetimeManyForecastValuesMW(
            datetime_utc=resp.timestamp_utc,
            forecast_values={
                location_uuids_to_location_ids[forecast.location_uuid]: round(
                    forecast.value_fraction * forecast.effective_capacity_watts / 10**6, 2)
                for forecast in resp.values
            })

        # sort the dictionary by keys
        forecasts_one_timestamp.forecast_values = \
            dict(sorted(forecasts_one_timestamp.forecast_values.items()))

        forecasts_per_timestamp.append(forecasts_one_timestamp)

    return forecasts_per_timestamp


@override
async def get_generation_for_multiple_locations(
    self,
    location_uuids_to_location_ids: dict[str, int],
    authdata: dict[str, str],
    start_datetime: dt.datetime | None = None,
    end_datetime: dt.datetime | None = None,
    observer_name: str = "ruvnl",
) -> list[models.GSPYieldGroupByDatetime]:
    """Get a forecast for multiple sites."""
    start, end = get_window(start=start_datetime, end=end_datetime)

    tasks = []
    for location_uuid in location_uuids_to_location_ids:
        req = dp.GetObservationsAsTimeseriesRequest(
            location_uuid=location_uuid,
            observer_name=observer_name,
            energy_source=dp.EnergySource.SOLAR,
            time_window=dp.TimeWindow(
                start_timestamp_utc=start,
                end_timestamp_utc=end,
            ),
        )
        task = asyncio.create_task(self.dp_client.get_observations_as_timeseries(req))
        tasks.append(task)

    list_results = await asyncio.gather(*tasks, return_exceptions=True)
    for exc in filter(lambda x: isinstance(x, Exception), list_results):
        raise exc

    # Combine results into GSPYieldGroupByDatetime
    observations_by_datetime = {}
    for observation in list_results:
        location_id = location_uuids_to_location_ids[observation.location_uuid]

        for value in observation.values:
            timestamp = value.timestamp_utc
            if timestamp not in observations_by_datetime:
                # make a dictionary generation_kw_by_gsp_id
                observations_by_datetime[timestamp] = {}

            generation_kw = int(value.effective_capacity_watts * value.value_fraction / 1000.0)
            observations_by_datetime[timestamp][location_id] = generation_kw

    # format to list of GSPYieldGroupByDatetime
    observations_by_datetime_formatted = [
        models.GSPYieldGroupByDatetime(
            datetime_utc=timestamp,
            generation_kw_by_gsp_id=dict(sorted(generation_kw_by_gsp_id.items())),
        )
        for timestamp, generation_kw_by_gsp_id in observations_by_datetime.items()
    ]
    return observations_by_datetime_formatted
```
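The unit handling in the hunk above (a capacity fraction converted to MW for forecasts and to kW for observations) can be sketched in isolation as two small helpers, matching the rounding behaviour of each path:

```python
def fraction_to_mw(value_fraction: float, effective_capacity_watts: float) -> float:
    """Convert a capacity fraction to megawatts, rounded to 2 dp as in the forecast path."""
    return round(value_fraction * effective_capacity_watts / 10**6, 2)


def fraction_to_kw(value_fraction: float, effective_capacity_watts: float) -> int:
    """Convert a capacity fraction to whole kilowatts as in the observation path."""
    return int(effective_capacity_watts * value_fraction / 1000.0)
```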
I don't think either of these functions should exist. I'm aware they might have to for this specific PR, but it goes against the design of the data platform to do these massive 2D calls.
Should I put it in the route then? Then it's less likely to be re-used.
Oh that's possibly sensible actually
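The fan-out pattern being debated, reduced to its core (pure asyncio; the fetch function is a hypothetical stand-in for the dp_client calls): one task per request, gathered concurrently, with the first failure re-raised after all tasks settle.

```python
import asyncio


async def fetch(timestamp: int) -> dict:
    """Hypothetical stand-in for a dp_client call such as get_forecast_at_timestamp."""
    await asyncio.sleep(0)  # yield to the event loop, simulating I/O
    return {"timestamp": timestamp, "values": [timestamp * 2]}


async def fan_out(timestamps: list[int]) -> list[dict]:
    """Issue one concurrent request per timestamp and surface the first error."""
    tasks = [asyncio.create_task(fetch(ts)) for ts in timestamps]
    # return_exceptions=True lets every task finish before errors are inspected
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for exc in (r for r in results if isinstance(r, Exception)):
        raise exc
    return results
```

Results come back in submission order, which is what lets the calling code zip them back up against its timestamp list.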
```python
@override
async def get_forecast_metadata() -> models.ForecastMetadata:
    raise NotImplementedError()
```
This might not be needed
```diff
@@ -1,22 +1,41 @@
 """Utility functions..."""

 import datetime as dt
```
Why modify this? It's now different to every other file
I can change this back.
```python
resp = await self.dp_client.list_forecasters(req, metadata={"traceid": traceid})
forecaster = resp.forecasters[0]

req = dp.GetForecastAsTimeseriesRequest(
```
The response to this returns the created_at and init_time of the forecast that produced each value in the timeseries. If we surfaced this into the models.PredictedPower dataclass, there would be no need for the get_forecast_metadata route.
Yea, that's one way to do it.
The tricky thing is, sometimes we want this metadata and sometimes we don't. The real stripped-down version should not get it; we should just get a timeseries. This is like an optional extra that users can request.
But then you already have it regardless, so you might as well pass it up to the top, and put the conditional solely on writing it back to the user at the very end of the route logic. There is no "stripping down" to be achieved by not passing it through; the data platform returns it regardless every time. In fact, all the "stripped down" version does at the moment is skip the extra (unnecessary) call to the database via get_forecast_metadata (so really it's the other version that's "stripped up"!)
so does there need to be a change in data-platform for this?
No, the data platform is already returning the init time and created time alongside every forecast value when you call GetForecastAsTimeseries. All that has to be done is for the DBClient's _get_predicted_solar_power_for_location function to actually propagate these values up the stack in their returned dataclasses. The National route can then either show or not show those values based on user preference.
```python
forecast = await db.get_forecast_metadata(
    location_uuid=national_location_uuid,
    authdata={},
    model_name=model_name,
)
```
This should call a more generic function that just gets the latest forecasts and then filters the output by name. I think a DBClient function like that would be more generally useful.
What do you mean by DBClient function here?
As in, an extra route on the DBClient interface that is get_latest_forecasts or something similar.
Ah I see, so it's basically renaming this?
Well, renaming it and changing its function. From what I can tell, what this function actually does in its current state is
- get the latest forecast to run for the given location (and optional forecaster name)
- return the initialisation time of that forecast
So get_latest_forecast_init_time_for_location is a more accurate name, but, as I've described, you don't need a new db call for this, as that metadata is already returned when the data platform gets you a forecast. As such I think this function should be completely overhauled to just be a more generic get_latest_forecasts function.
```python
if include_metadata:
    forecast_metadata: ForecastMetadata = await db.get_forecast_metadata(
        location_uuid=national_location_uuid,
        model_name=model_name,
        authdata=auth,
    )
```
This whole thing could be removed, as the value can be obtained from the get_predicted_solar_power_production_for_location call below (if the previously mentioned changes are made).
Pull Request
Description
Move uk-national over to this quartz-api
Currently: #147
How Has This Been Tested?
Don't do in this PR: add end-to-end test for uk-national with the app (#179)
Don't do in this PR: add separate UAT in Airflow that tests it
Checklist: